Forecasting hospital cases with a Prophet model, using extra regressors from vaccine data for people in the middle age groups (18-40) who received the 1st dose of vaccine. Forecast horizon: 3 days.

Initial Data Cleaning

The vaccine data, the hospital case data, and the wastewater signal data have been loaded, cleaned, and merged into one final dataframe, final_data.

The variables have been log-transformed.

The response variable observed_census_ICU_p_acute_care has been renamed to y and the date column to ds, the column names Prophet expects.

The dataset is divided into a training set and a test set. The test set consists of the last 3 days of data.
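
A time-ordered holdout like this can be sketched as follows. This is a minimal Python illustration, not the report's own code: the function name `split_last_n_days`, the row layout, and the sample values are all hypothetical, and it assumes one row per day.

```python
from datetime import date, timedelta


def split_last_n_days(rows, n_test=3):
    """Hold out the most recent n_test rows (one row per day) as the test set."""
    if not 0 < n_test < len(rows):
        raise ValueError("n_test must be between 1 and len(rows) - 1")
    ordered = sorted(rows, key=lambda row: row["ds"])  # oldest first
    return ordered[:-n_test], ordered[-n_test:]


# Hypothetical daily data: ten consecutive days ending 2022-02-01.
last_day = date(2022, 2, 1)
rows = [{"ds": last_day - timedelta(days=k), "y": 4.0 + 0.01 * k} for k in range(10)]

train, test = split_last_n_days(rows, n_test=3)
print(len(train), len(test))
```

Splitting by date rather than at random keeps every test day strictly after the training period, which is what a forecast evaluation needs.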

Adding the other variables as extra regressors to the model.

Fitting the Prophet Model.

Forecasting 3 days into the future

Checking the last 6 days of the forecast data

#>             ds     yhat yhat_lower yhat_upper
#> 407 2022-01-27 4.283055   3.109918   5.558322
#> 408 2022-01-28 4.316604   3.068336   5.613060
#> 409 2022-01-29 4.251261   3.049918   5.464974
#> 410 2022-01-30 4.552796   3.326777   5.754792
#> 411 2022-01-31 4.099320   2.924277   5.253621
#> 412 2022-02-01 4.236492   3.002532   5.478749
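
The yhat values in the table above appear to be on the log scale, since the variables were log-transformed before fitting. A minimal Python sketch of mapping the last three rows back to the original scale follows. It assumes a plain natural-log transform with no offset, which the report does not state, and the helper name `to_original_scale` is hypothetical.

```python
import math

# Last three rows of the forecast table: (ds, yhat, yhat_lower, yhat_upper).
forecast_log = [
    ("2022-01-30", 4.552796, 3.326777, 5.754792),
    ("2022-01-31", 4.099320, 2.924277, 5.253621),
    ("2022-02-01", 4.236492, 3.002532, 5.478749),
]


def to_original_scale(rows):
    """Exponentiate each value; assumes a plain natural-log transform."""
    return [(ds, *(math.exp(v) for v in values)) for ds, *values in rows]


forecast_counts = to_original_scale(forecast_log)
for ds, yhat, lower, upper in forecast_counts:
    print(f"{ds}: {yhat:.1f} (interval {lower:.1f} to {upper:.1f})")
```

Because the exponential is monotone, the interval bounds keep their order after the transform, but the resulting interval is no longer symmetric around the point forecast.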

The plot shows the actual data and the predictions from the Prophet forecast. The blue line is the predicted data and the black dots are the actual data.

Root Mean Squared Error (RMSE) on the training set:

#> [1] 26.15753
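
For reference, RMSE is the square root of the mean squared difference between actual and predicted values. The Python sketch below uses hypothetical numbers, not the report's data. The reported magnitude suggests the metric was computed on the original scale rather than the log scale, but that is an assumption.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values on the original (count) scale.
actual = [100.0, 110.0, 120.0]
predicted = [97.0, 114.0, 120.0]
error = rmse(actual, predicted)
print(round(error, 4))
```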

Mean Absolute Percentage Error (MAPE) on the training set:

#> [1] 0.6031466
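
MAPE averages the absolute error relative to each actual value. The Python sketch below returns it as a fraction, so 0.10 means 10%. Whether the figure reported here is a fraction or a percentage is not stated, so the unit is an assumption, and the sample values are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (0.10 means 10%)."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    relative_errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return sum(relative_errors) / len(relative_errors)


# Hypothetical values: relative errors of 10%, 10%, and 0%.
actual = [100.0, 200.0, 50.0]
predicted = [90.0, 220.0, 50.0]
score = mape(actual, predicted)
print(round(score, 4))
```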

Standard deviation of the actual data

#> [1] 35.92902
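
One way to read the standard deviation next to the RMSE: a baseline that always predicts the mean has an RMSE equal to the population standard deviation, so the ratio of the two shows how much the model improves on that baseline. The Python sketch below assumes both reported numbers were computed on the same data and scale, which the report does not confirm. The demonstration values are hypothetical.

```python
import math
import statistics

train_rmse = 26.15753  # reported RMSE on the training set
actual_sd = 35.92902   # reported standard deviation of the actual data

# A ratio below 1 means the model beats a constant mean prediction.
ratio = train_rmse / actual_sd
print(f"RMSE / SD = {ratio:.3f}")

# Demonstration on hypothetical values: predicting the mean everywhere
# yields an RMSE equal to the population standard deviation.
values = [60.0, 80.0, 100.0, 120.0, 140.0]
mean_value = statistics.fmean(values)
baseline_rmse = math.sqrt(sum((v - mean_value) ** 2 for v in values) / len(values))
print(baseline_rmse, statistics.pstdev(values))
```

Sample standard deviations (dividing by n - 1) are slightly larger than the population value used here, so the comparison is approximate on short series.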

Plots comparing actual data and predicted data

RMSE on the test set:

#> [1] 13.63988

MAPE on the test set:

#> [1] 0.1263809

Plots comparing actual data and predicted data

The error metrics are low when the model is regressed against only those who received the 2nd dose in the middle age groups.

Component plot showing the model's trend and its yearly and weekly seasonality.

The model trend shows a weak linear increase in cases until July/August 2021, followed by a strong linear increase until January 2022. The extra-regressor plot shows the additive effect of the regressors: the contribution of 2nd-dose vaccination in the middle age groups remains high until April/May 2021, drops sharply in July 2021, and then stays low and steady. The weekly component shows more hospitalizations on Tuesdays and Thursdays.